-
High-quality benchmarks are essential for evaluating the reasoning and retrieval capabilities of large language models (LLMs). However, curating datasets for this purpose is not a permanent solution, because curated datasets are prone to data leakage and inflated performance results. To address these challenges, we propose PhantomWiki: a pipeline to generate unique and factually consistent document corpora with diverse question-answer pairs. Unlike prior work, PhantomWiki is neither a fixed dataset nor based on any existing data. Instead, a new PhantomWiki instance is generated on demand for each evaluation. We vary the question difficulty and corpus size to disentangle reasoning and retrieval capabilities, respectively, and find that PhantomWiki datasets are surprisingly challenging for frontier LLMs. Thus, we contribute a scalable and data leakage-resistant framework for disentangled evaluation of reasoning, retrieval, and tool-use abilities.
Free, publicly accessible full text available July 16, 2026
-
Li, Gene; Li, Junbo; Kabra, Anmol; Srebro, Nati; Wang, Zhaoran; Yang, Zhuoran (Advances in Neural Information Processing Systems)
-
Bjorck, Johan; Kabra, Anmol; Weinberger, Kilian Q.; Gomes, Carla (Proceedings of the AAAI Conference on Artificial Intelligence)
Non-negative matrix factorization (NMF) is a highly celebrated algorithm for matrix decomposition that guarantees non-negative factors. The underlying optimization problem is computationally intractable, yet in practice, gradient-descent-based methods often find good solutions. In this paper, we revisit the NMF optimization problem and analyze its loss landscape in non-worst-case settings. It has recently been observed that gradients in deep networks tend to point towards the final minimizer throughout the optimization procedure. We show that a similar property holds (with high probability) for NMF, provably in a non-worst-case model with a planted solution, and empirically across an extensive suite of real-world NMF problems. Our analysis predicts that this property becomes more likely as the number of parameters grows, and experiments suggest that a similar trend might also hold for deep neural networks, turning increasing dataset sizes and model sizes into a blessing from an optimization perspective.
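The property described in this abstract, that gradients tend to point towards the final minimizer, can be probed numerically. The following is a minimal sketch only, not the authors' experimental setup: it assumes a squared Frobenius loss, plain projected gradient descent, and a small random planted instance. The function name `nmf_gradient_alignment` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def nmf_gradient_alignment(n=20, m=15, r=3, steps=300, lr=0.01, seed=0):
    """Run projected gradient descent on a planted NMF instance and
    return (losses, cosines), where cosines[t] is the cosine between the
    negative gradient at step t and the direction to the final iterate."""
    rng = np.random.default_rng(seed)
    # Planted solution (assumption): X is exactly W_star @ H_star.
    W_star = rng.random((n, r))
    H_star = rng.random((r, m))
    X = W_star @ H_star

    # Random non-negative initialization.
    W = rng.random((n, r))
    H = rng.random((r, m))

    iterates, grads, losses = [], [], []
    for _ in range(steps):
        R = W @ H - X                      # residual
        gW = R @ H.T                       # gradient of 0.5*||WH - X||_F^2 w.r.t. W
        gH = W.T @ R                       # gradient w.r.t. H
        losses.append(0.5 * float(np.sum(R ** 2)))
        iterates.append(np.concatenate([W.ravel(), H.ravel()]))
        grads.append(np.concatenate([gW.ravel(), gH.ravel()]))
        # Gradient step followed by projection onto the non-negative orthant.
        W = np.maximum(W - lr * gW, 0.0)
        H = np.maximum(H - lr * gH, 0.0)

    final = np.concatenate([W.ravel(), H.ravel()])
    cosines = []
    for theta, g in zip(iterates, grads):
        d = final - theta                  # direction to the final iterate
        denom = np.linalg.norm(d) * np.linalg.norm(g)
        if denom > 0:
            cosines.append(float(-(g @ d) / denom))
    return losses, cosines

if __name__ == "__main__":
    losses, cosines = nmf_gradient_alignment()
    print("initial loss:", losses[0], "final loss:", losses[-1])
    print("fraction of steps with positive alignment:",
          sum(c > 0 for c in cosines) / len(cosines))
```

A cosine close to 1 at a given step means the descent direction points almost straight at the final iterate. Here the final iterate of the run stands in for the minimizer, which is a simplification.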
-
Bjorck, Johan; Kabra, Anmol; Weinberger, Kilian Q.; Gomes, Carla (Proceedings of the AAAI Conference on Artificial Intelligence)